24 research outputs found
How algorithmic popularity bias hinders or promotes quality
Algorithms that favor popular items are used to help us select among many
choices, from engaging articles on a social media news feed to songs and books
that others have purchased, and from top-ranked search engine results to
highly-cited scientific papers. The goal of these algorithms is to identify
high-quality items such as reliable news, beautiful movies, prestigious
information sources, and important discoveries; in short, high-quality
content should rank at the top. Prior work has shown that choosing what is
popular may amplify random fluctuations and ultimately lead to sub-optimal
rankings. Nonetheless, it is often assumed that recommending what is popular
will help high-quality content "bubble up" in practice. Here we identify the
conditions in which popularity may be a viable proxy for quality content by
studying a simple model of a cultural market endowed with an intrinsic notion of
quality. A parameter representing the cognitive cost of exploration controls
the critical trade-off between quality and popularity. We find a regime of
intermediate exploration cost where an optimal balance exists, such that
choosing what is popular actually promotes high-quality items to the top.
Outside of these limits, however, popularity bias is more likely to hinder
quality. These findings clarify the effects of algorithmic popularity bias on
quality outcomes, and may inform the design of more principled mechanisms for
techno-social cultural markets.
Construction and re-construction of identities: A study of learners' personal and L2 identity
The indispensable role of identity in language learning has recently attracted considerable attention among SLA scholars. Accordingly, the current mixed-methods classroom-based study investigated whether the use of intercultural movie clips could improve participants' personal identity and have a positive impact on their L2 identity in the English as a foreign language (EFL) context of Iran. To this end, two intact classes of thirty students each were assigned to the control and experimental groups. This quasi-experimental study followed a pre-test/post-test equivalent-group design. Drawing on quantitative and qualitative analysis of two questionnaires and a semi-structured interview, the results indicate that positive changes took place in the personal and second language identity of the participants. More specifically, they moved from a closed community of practice, in which the self was seen from a single horizon, to an intercultural community of practice, in which others were seen alongside the self. The changing community provided by the movie clips had an impact on the participants' views and trends. Thus, access to new social, cultural, and linguistic resources resulted in the adoption of new identities. Teachers and educators should therefore recognize that language can be a site for the construction of self-identification or group affiliation, since language is a key element in identity formation and identity is a sense of self or sense of belonging.
A Linear Classifier Based on Entity Recognition Tools and a Statistical Approach to Method Extraction in the Protein-Protein Interaction Literature
We participated in the Article Classification and the Interaction Method
subtasks (ACT and IMT, respectively) of the Protein-Protein Interaction task of
the BioCreative III Challenge. For the ACT, we pursued an extensive testing of
available Named Entity Recognition and dictionary tools, and used the most
promising ones to extend our Variable Trigonometric Threshold linear
classifier. For the IMT, we experimented with a primarily statistical approach,
as opposed to employing a deeper natural language processing strategy. Finally,
we also studied the benefits of integrating the method extraction approach that
we have used for the IMT into the ACT pipeline. For the ACT, our linear article
classifier leads to a ranking and classification performance significantly
higher than all the reported submissions. For the IMT, our results are
comparable to those of other systems, which took very different approaches. For
the ACT, we show that the use of named entity recognition tools leads to a
substantial improvement in the ranking and classification of articles relevant
to protein-protein interaction. Thus, we show that our substantially expanded
linear classifier is a very competitive classifier in this domain. Moreover,
this classifier produces interpretable surfaces that can be understood as
"rules" for human understanding of the classification. In terms of the IMT
task, in contrast to other participants, our approach focused on identifying
sentences that are likely to bear evidence for the application of a PPI
detection method, rather than on classifying a document as relevant to a
method. As BioCreative III did not perform an evaluation of the evidence
provided by the system, we have conducted a separate assessment; the evaluators
agree that our tool is indeed effective in detecting relevant evidence for PPI
detection methods. Comment: BMC Bioinformatics. In Press.
Global labor flow network reveals the hierarchical organization and dynamics of geo-industrial clusters in the world economy
Groups of firms often achieve a competitive advantage through the formation
of geo-industrial clusters. Although many exemplary clusters, such as Hollywood
or Silicon Valley, have been frequently studied, systematic approaches to
identify and analyze the hierarchical structure of the geo-industrial clusters
at the global scale are rare. In this work, we use LinkedIn's employment
histories of more than 500 million users over 25 years to construct a labor
flow network of over 4 million firms across the world and apply a recursive
network community detection algorithm to reveal the hierarchical structure of
geo-industrial clusters. We show that the resulting geo-industrial clusters
exhibit a stronger association between the influx of educated workers and
financial performance, compared to existing aggregation units. Furthermore, our
additional analysis of the skill sets of educated workers supplements the
relationship between the labor flow of educated workers and productivity
growth. We argue that geo-industrial clusters defined by labor flow provide
better insights into the growth and the decline of the economy than other
common economic units.
Testing extensive use of NER tools in article classification and a statistical approach for method interaction extraction in the protein-protein interaction literature
We participated (as Team 81) in the Article Classification (ACT) and Interaction Method (IMT) subtasks
of the Protein-Protein Interaction task of the BioCreative III Challenge. For the ACT we pursued an extensive
testing of available Named Entity Recognition (NER) tools, and used the most promising ones to extend
the Variable Trigonometric Threshold (VTT) linear classifier that we successfully used in BioCreative II and II.5. Our
main goal was to exploit the power of available NER tools to aid in the classification of documents
relevant for Protein-Protein Interaction. We also used a Support Vector Machine Classifier on NER features for
comparison purposes. For the IMT, we experimented with a primarily statistical approach, as opposed to a deeper
natural language processing strategy; in a nutshell, we exploited classifiers, simple pattern matching, and ranking
of candidate matches using statistical considerations. We will also report on our efforts to integrate our IMT
method sentence classifier into our ACT pipeline.